In [19]:
import pandas as pd
import seaborn as sns
import plotly.express as px

import matplotlib.pyplot as plt
In [20]:
import plotly.io as pio
pio.renderers.default = "plotly_mimetype+notebook"

Matplotlib¶

For this exercise, we have written the following code to load the stock dataset built into plotly express.

In [21]:
stocks = px.data.stocks()
stocks.head()
Out[21]:
date GOOG AAPL AMZN FB NFLX MSFT
0 2018-01-01 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 2018-01-08 1.018172 1.011943 1.061881 0.959968 1.053526 1.015988
2 2018-01-15 1.032008 1.019771 1.053240 0.970243 1.049860 1.020524
3 2018-01-22 1.066783 0.980057 1.140676 1.016858 1.307681 1.066561
4 2018-01-29 1.008773 0.917143 1.163374 1.018357 1.273537 1.040708

Question 1:¶

Select a stock and create a suitable plot for it. Make sure the plot is readable and shows relevant information, such as dates and values.

In [22]:
# YOUR CODE HERE
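# Create a single figure of the desired size; a separate plt.figure() call
# would only leave an extra, empty figure behind.
fig, ax = plt.subplots(figsize=(12, 9))
ax.plot(stocks.date, stocks.FB)
ax.set_title('Facebook Stock')
ax.set_xlabel('Date')
ax.set_ylabel('Stock Value')
# Limit the number of x-axis ticks so the date labels stay readable
ax.xaxis.set_major_locator(plt.MaxNLocator(5))

plt.show()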

Question 2:¶

You've already plotted data from one stock. It is also possible to plot several stocks together to support comparison.
To distinguish the lines, customise line styles, markers and colors, and add a legend to the plot.

In [23]:
# YOUR CODE HERE
# Create one figure of the desired size; a separate plt.figure() call
# would only add an empty figure.
fig, ax = plt.subplots(figsize=(12, 9))

date = stocks.date
goog = stocks['GOOG']
ax.plot(date, goog, color='blue', label='GOOG')

aapl = stocks['AAPL']
ax.plot(date, aapl, color='orange', label='AAPL')

amzn = stocks['AMZN']
ax.plot(date, amzn, color='green', label='AMZN')

fb = stocks['FB']
ax.plot(date, fb, color='red', label='FB')

nflx = stocks['NFLX']
ax.plot(date, nflx, color='purple', label='NFLX')

# Plot the MSFT series itself (not AAPL) under the MSFT label
msft = stocks['MSFT']
ax.plot(date, msft, color='brown', label='MSFT')

# Limit the number of x-axis ticks so the date labels stay readable
ax.xaxis.set_major_locator(plt.MaxNLocator(5))

ax.set_title('Stock')
ax.set_xlabel('Date')
ax.set_ylabel('Stock Value')

plt.legend()
plt.show()
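Question 2 also asks for line styles and markers, not only colors. The following is a minimal sketch of one way to vary them per line. It assumes a small hand-typed sample (the first three rows of the table above, three tickers only), and the `styles` mapping is a hypothetical helper introduced for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# First three rows of the stocks table shown above (three tickers only).
sample = pd.DataFrame({
    "date": pd.to_datetime(["2018-01-01", "2018-01-08", "2018-01-15"]),
    "GOOG": [1.000000, 1.018172, 1.032008],
    "AAPL": [1.000000, 1.011943, 1.019771],
    "AMZN": [1.000000, 1.061881, 1.053240],
})

# Hypothetical mapping: ticker -> (line style, marker).
styles = {"GOOG": ("-", "o"), "AAPL": ("--", "s"), "AMZN": (":", "^")}

fig, ax = plt.subplots(figsize=(8, 5))
for ticker, (linestyle, marker) in styles.items():
    ax.plot(sample["date"], sample[ticker],
            linestyle=linestyle, marker=marker, label=ticker)

ax.set_xlabel("Date")
ax.set_ylabel("Stock Value")
ax.legend()
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

In a notebook the figure renders at the end of the cell; elsewhere, `plt.show()` can be called afterwards.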

Seaborn¶

First, load the tips dataset.

In [24]:
tips = sns.load_dataset('tips')
tips.head()
Out[24]:
total_bill tip sex smoker day time size
0 16.99 1.01 Female No Sun Dinner 2
1 10.34 1.66 Male No Sun Dinner 3
2 21.01 3.50 Male No Sun Dinner 3
3 23.68 3.31 Male No Sun Dinner 2
4 24.59 3.61 Female No Sun Dinner 4

Question 3:¶

Let's explore this dataset. Pose a question and create a plot that helps answer it.

Some possible questions:

  • Are there differences between male and female customers when it comes to giving tips?
  • Which attribute correlates most with tip?
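For the correlation question, a rough first look can come from pandas before any plotting. The sketch below assumes only the five rows printed by `tips.head()` above, so the numbers it produces are illustrative and say nothing reliable about the full dataset.

```python
import pandas as pd

# The five rows printed by tips.head() above, numeric columns only.
head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "size": [2, 3, 3, 2, 4],
})

# Pearson correlation of each numeric column with tip, strongest first.
corr_with_tip = head.corr()["tip"].drop("tip").sort_values(ascending=False)
print(corr_with_tip)
```

Running the same two lines on the full `tips` dataframe (restricted to its numeric columns) would give the figures that actually answer the question.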
In [25]:
# YOUR CODE HERE

# What is the correlation between day and eating time?
g = sns.FacetGrid(tips, col='time', hue='sex')
g.map(sns.scatterplot, 'day', 'total_bill')
g.add_legend()
plt.show()

Plotly Express¶

Question 4:¶

Redo the above exercises (questions 2 & 3) with plotly express. Create diagrams which you can interact with.

The stocks dataset¶

Hints:

  • Turn the stocks dataframe into a structure that plotly express can pick up easily
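One common reading of this hint is to reshape the table from wide to long format. The sketch below assumes a hand-typed sample of the first three rows shown earlier, and the column names `ticker` and `value` are hypothetical choices made for this example.

```python
import pandas as pd

# First three rows of the stocks table shown above (three tickers only).
wide = pd.DataFrame({
    "date": ["2018-01-01", "2018-01-08", "2018-01-15"],
    "GOOG": [1.000000, 1.018172, 1.032008],
    "AAPL": [1.000000, 1.011943, 1.019771],
    "AMZN": [1.000000, 1.061881, 1.053240],
})

# Wide format has one column per ticker; long format has one row per
# (date, ticker) pair. "ticker" and "value" are names chosen for this sketch.
long = wide.melt(id_vars="date", var_name="ticker", value_name="value")
print(long)
```

A long frame like this could then be passed to plotly express, for example as `px.line(long, x="date", y="value", color="ticker")`.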
In [26]:
# YOUR CODE HERE
# Alternative using plotly express directly (left disabled):
#fig = px.line(stocks, x='date', y=['GOOG', 'AAPL', 'AMZN', 'FB', 'NFLX', 'MSFT'], markers=True)
#fig.show()

import plotly.graph_objects as go

fig = go.Figure()
fig.add_trace(go.Scatter(x=stocks.date, y=stocks.AAPL, mode='lines+markers', marker_symbol='circle', name='AAPL'))
fig.add_trace(go.Scatter(x=stocks.date, y=stocks.AMZN, mode='lines+markers', marker_symbol='diamond', name='AMZN'))
fig.add_trace(go.Scatter(x=stocks.date, y=stocks.FB, mode='lines+markers', marker_symbol='square', name='FB'))
fig.add_trace(go.Scatter(x=stocks.date, y=stocks.GOOG, mode='lines+markers', marker_symbol='bowtie', name='GOOG'))
fig.add_trace(go.Scatter(x=stocks.date, y=stocks.NFLX, mode='lines+markers', marker_symbol='star', name='NFLX'))
fig.add_trace(go.Scatter(x=stocks.date, y=stocks.MSFT, mode='lines+markers', marker_symbol='x', name='MSFT'))

fig.update_layout(xaxis_title='date', yaxis_title='value')
fig.show()

The tips dataset¶

In [27]:
# YOUR CODE HERE
fig = px.scatter(tips, x='total_bill', y='tip', facet_col='smoker', facet_row='time', color='sex')
fig.show()

Question 5:¶

Recreate the barplot below that shows the population of different continents for the year 2007.

Hints:

  • Extract the 2007 data from the dataframe and process it accordingly
  • Use plotly bar
  • Add different colors for different continents
  • Sort the continents in the visualisation using an axis layout setting
  • Add text to each bar that represents the population
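The data-processing part of these hints can be sketched with pandas alone. The toy frame below is hypothetical: it only mimics the gapminder columns used here, and its country names and numbers are made up.

```python
import pandas as pd

# Hypothetical toy rows mimicking the gapminder columns; values are made up.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "A", "B", "C"],
    "continent": ["Asia", "Asia", "Europe", "Asia", "Asia", "Europe"],
    "year": [2002, 2002, 2002, 2007, 2007, 2007],
    "pop": [10, 20, 5, 12, 23, 6],
})

# Keep one year, then total the population per continent.
toy_2007 = toy[toy["year"] == 2007]
pop_by_continent = (
    toy_2007.groupby("continent", as_index=False)["pop"].sum()
    .sort_values("pop")  # smallest continent first
)
print(pop_by_continent)
```

Summing only the `pop` column keeps text columns such as the country name out of the aggregation.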
In [28]:
#load data
df = px.data.gapminder()
df.head()
Out[28]:
country continent year lifeExp pop gdpPercap iso_alpha iso_num
0 Afghanistan Asia 1952 28.801 8425333 779.445314 AFG 4
1 Afghanistan Asia 1957 30.332 9240934 820.853030 AFG 4
2 Afghanistan Asia 1962 31.997 10267083 853.100710 AFG 4
3 Afghanistan Asia 1967 34.020 11537966 836.197138 AFG 4
4 Afghanistan Asia 1972 36.088 13079460 739.981106 AFG 4
In [29]:
# YOUR CODE HERE
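df_2007 = df.query('year == 2007')
# Sum only the population column so that text columns (country, iso_alpha)
# are not aggregated, and keep continent as a regular column
df_2007_continent = df_2007.groupby('continent', as_index=False)['pop'].sum()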
fig = px.bar(df_2007_continent, x='pop', y='continent', orientation='h', color='continent', text='pop')
fig.update_yaxes(categoryorder='total ascending')
fig.update_traces(textposition='outside', texttemplate='%{x:.2s}', textfont_size=14)
fig.show()